Abstract

A lower bound on the secrecy capacity of the wiretap channel with state information available 
causally at both the encoder and decoder is established. The lower bound is shown to be strictly larger 
than that for the noncausal case by Liu and Chen. Achievability is proved using block Markov coding, 
Shannon strategy, and key generation from common state information. The state sequence available at 
the end of each block is used to generate a key, which is used to enhance the transmission rate of the 
confidential message in the following block. An upper bound on the secrecy capacity when the state is 
available noncausally at the encoder and decoder is established and is shown to coincide with the lower 
bound for several classes of wiretap channels with state. 

I. Introduction 

Consider the 2-receiver wiretap channel with state depicted in Figure 1. The sender X wishes to send
a message to the legitimate receiver Y while keeping it asymptotically secret from the eavesdropper 
Z. The secrecy capacity for this channel can be defined under various scenarios of state information 
availability at the encoder and decoder. When the state information is not available at either party, the 
problem reduces to the classical wiretap channel for the channel averaged over the state and the secrecy 
capacity is known [1], [2]. When the state is available only at the decoder, the problem reduces to the 
wiretap channel with augmented receiver (Y, S). 

Fig. 1: Wiretap channel with state

The interesting scenarios to consider therefore are when the state information is available at the encoder
and may or may not be available at the decoder. This raises the question of how the encoder and decoder
can make use of the state information to increase the secrecy rate. In [3], Chen and Vinck established
a lower bound on the secrecy capacity when the state information is available noncausally only at the
encoder. The lower bound is established using a combination of Gelfand-Pinsker coding and Wyner 
wiretap coding. Subsequently, Liu and Chen [4] used the same techniques to establish a lower bound on 
the secrecy capacity when the state information is available noncausally at both the encoder and decoder. 
In a related direction, Khisti, Diggavi, and Wornell [5] considered the problem of secret key agreement 
first studied in [6] and [7] for the wiretap channel with state and established the secret key capacity when 
the state is available causally or noncausally at the encoder and decoder. The key is generated in two 
parts; the first using a wiretap channel code while treating the state sequence as a time-sharing sequence, 
and the second part is generated from the state itself. 

In this paper, we consider the wiretap channel with state information available causally at the encoder 
and decoder. We show that the lower bound for the noncausal case in [4] is achievable when only causal 
state information is available. Our achievability scheme, however, is quite different from that for the 
noncausal case. We use block Markov coding, Shannon strategy for channels with state [8], and secret 
key agreement from state information, which builds on the work in [5]. However, unlike [5], we are not 
directly interested in the size of the secret key, but rather in using the secret key generated from the 
state sequence in one transmission block to increase the secrecy rate in the following block. This block 
Markov scheme causes additional information leakage through the correlation between the secret key 
generated in a block and the received sequences at the eavesdropper in subsequent blocks. Although a 
similar block Markov coding scheme was used in [9] to establish the secrecy capacity of the degraded 
wiretap channel with rate limited secure feedback, in their setup no information about the key is leaked 
to the eavesdropper because the feedback link is assumed to be secure. 

We also establish an upper bound on the secrecy capacity of the wiretap channel with state information 
available noncausally at the encoder and decoder. We show that the upper bound coincides with the 
aforementioned lower bound for several classes of channels. Thus, the secrecy capacity for these classes 
does not depend on whether the state information is known causally or noncausally at the encoder. 

The rest of the paper is organized as follows. In Section II, we provide the needed definitions. In 
Section III, we summarize and discuss the main results in the paper. The proofs of the lower and upper 
bounds are detailed in Sections IV and V, respectively. 

II. Problem Definition 

Consider a discrete memoryless wiretap channel (DM-WTC) with discrete memoryless (DM) state
$(\mathcal{X} \times \mathcal{S}, p(y,z|x,s)p(s), \mathcal{Y}, \mathcal{Z})$ consisting of a finite input alphabet $\mathcal{X}$, finite output alphabets $\mathcal{Y}$, $\mathcal{Z}$, a
finite state alphabet $\mathcal{S}$, a collection of conditional pmfs $p(y,z|x,s)$ on $\mathcal{Y} \times \mathcal{Z}$, and a pmf $p(s)$ on the state
alphabet $\mathcal{S}$. The sender X wishes to send a confidential message $M \in [1 : 2^{nR}]$ to the receiver Y while
keeping it secret from the eavesdropper Z with either causal or noncausal state information available at
both the encoder and decoder.

A $(2^{nR}, n)$ code for the DM-WTC with causal state information at the encoder and decoder consists
of: (i) a message set $[1 : 2^{nR}]$; (ii) an encoder that generates a symbol $X_i(m)$ according to a conditional
pmf $p(x_i | m, s^i, x^{i-1})$ for $i \in [1 : n]$; and (iii) a decoder that assigns an estimate $\hat{M}$ or an error message to
each received sequence pair $(y^n, s^n)$. We assume throughout that the message M is uniformly distributed
over the message set. The probability of error is defined as $P_e^{(n)} = P\{\hat{M} \neq M\}$. The information leakage
rate at the eavesdropper Z, which measures the amount of information about M that leaks out to the
eavesdropper, is defined as $R_L = \frac{1}{n} I(M; Z^n)$. A secrecy rate R is said to be achievable if there exists a
sequence of codes with $P_e^{(n)} \to 0$ and $R_L \to 0$ as $n \to \infty$. The secrecy capacity $C_{\text{S-CSI}}$ is the supremum
of the set of achievable rates.

We also consider the case when the state information is available noncausally at the encoder. The
only change in the above definitions is that the encoder now generates a codeword $X^n(m)$ according
to the conditional pmf $p(x^n | m, s^n)$, i.e., the stochastic mapping is allowed to depend on the entire state
sequence instead of just the past and present state sequence. The secrecy capacity for this scenario is
denoted by $C_{\text{S-NCSI}}$.

The notation used in this paper will follow that of El Gamal-Kim Lectures on Network Information 
Theory [10]. 

III. Summary of Main Results 

We summarize the results in this paper. Proofs are given in the following two sections and in the 
Appendix. 

Lower Bound 

The main result in this paper is the following lower bound on the secrecy capacity of the DM-WTC
with state information available causally at both the encoder and decoder.

Theorem 1: The secrecy capacity of the DM-WTC with state information available causally at the 
encoder and decoder is lower bounded as 

$$C_{\text{S-CSI}} \ge \max\Big\{ \max_{p(v|s)p(x|v,s)} \min\{I(V;Y|S) - I(V;Z|S) + H(S|Z),\ I(V;Y|S)\},\ \max_{p(v)p(x|v,s)} \min\{H(S|Z,V),\ I(V;Y|S)\} \Big\}. \quad (1)$$

Note that if $S = \emptyset$, the above lower bound reduces to the secrecy capacity for the wiretap channel. Define

$$R_{\text{S-CSI-1}} = \max_{p(v|s)p(x|v,s)} \min\{I(V;Y|S) - I(V;Z|S) + H(S|Z),\ I(V;Y|S)\},$$

$$R_{\text{S-CSI-2}} = \max_{p(v)p(x|v,s)} \min\{H(S|Z,V),\ I(V;Y|S)\}.$$

Then, (1) can be expressed as

$$C_{\text{S-CSI}} \ge \max\{R_{\text{S-CSI-1}},\ R_{\text{S-CSI-2}}\}.$$

The proof of this theorem is detailed in Section IV. 

In [4], the authors established the following lower bound for the noncausal case 

$$\begin{aligned}
C_{\text{S-NCSI}} &\ge \max_{p(u|s)p(x|u,s)} \big(I(U;Y,S) - \max\{I(U;Z),\ I(U;S)\}\big) \\
&= \max_{p(u|s)p(x|u,s)} \min\{I(U;Y|S) - I(U;Z|S) + I(S;U|Z),\ I(U;Y|S)\}. \quad (2)
\end{aligned}$$

Clearly, $R_{\text{S-CSI-1}}$ is at least as large as this lower bound, since $I(S;U|Z) \le H(S|Z)$. Hence, our lower bound (1) is at least as large
as the lower bound (2). We now show that the lower bound (2) is in fact as large as $R_{\text{S-CSI-1}}$.

Fix $V \in [0 : |\mathcal{V}| - 1]$, $p(v|s)$, and $p(x|v,s)$ in $R_{\text{S-CSI-1}}$. Let $U \in [0 : |\mathcal{V}||\mathcal{S}| - 1]$ in bound (2). Define
the conditional probability mass functions as follows: for $u = v + s|\mathcal{V}|$, let $p(u|s) = p(v|s)$ and $p(x|u,s) = p(x|v,s)$,
and let $p(u|s) = p(x|u,s) = 0$ otherwise. Under this mapping, it is easy to see that $H(S|Z,U) = 0$ and
the other terms in (2) reduce to those in $R_{\text{S-CSI-1}}$.
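
The index mapping $u = v + s|\mathcal{V}|$ used above can be illustrated with a minimal sketch. The function names `merge_index` and `split_index` are hypothetical and only serve to show that the state s is a deterministic function of u, which is why the term $H(S|Z,U)$ vanishes under this choice of U.

```python
def merge_index(v: int, s: int, v_size: int) -> int:
    """Pack the pair (v, s) into a single index u = v + s * |V|."""
    return v + s * v_size


def split_index(u: int, v_size: int) -> tuple:
    """Recover (v, s) from u; s is a deterministic function of u."""
    s, v = divmod(u, v_size)
    return v, s


if __name__ == "__main__":
    v_size, s_size = 3, 2  # illustrative alphabet sizes, not taken from the paper
    for s in range(s_size):
        for v in range(v_size):
            u = merge_index(v, s, v_size)
            print(u, split_index(u, v_size))
```

Because the mapping is one-to-one on $[0 : |\mathcal{V}||\mathcal{S}| - 1]$, knowing U reveals S exactly.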

We now show that our lower bound (1) can be strictly larger than that for the noncausal case (2).
This is done via an example for which $R_{\text{S-CSI-2}} > R_{\text{S-CSI-1}}$.

Consider the channel in Figure 2, where $X, Y, Z, S \in \{0, 1\}$ and $p(y,z|x,s) = p(y,z|x)$ with channel
transition probabilities as defined in the figure. The state S is an i.i.d. process that is observed by X and
Y with $H(S) = 1 - H(0.1)$.

By setting $V = X$ independent of S and $P\{X = 1\} = P\{X = 0\} = 0.5$ in $R_{\text{S-CSI-2}}$, we obtain

$$R_{\text{S-CSI-2}} = 1 - H(0.1).$$

We now show that $R_{\text{S-CSI-1}}$ is strictly smaller than $1 - H(0.1)$. First, note that

$$\begin{aligned}
I(V;Y|S) &= H(Y|S) - H(Y|V,S) \\
&\le H(Y) - H(Y|X) \\
&= I(X;Y) \le 1 - H(0.1).
\end{aligned}$$

However, for $R_{\text{S-CSI-1}} \ge 1 - H(0.1)$, we must have $I(V;Y|S) \ge 1 - H(0.1)$. Hence, we must have
$I(V;Y|S) = 1 - H(0.1)$. Next, consider

$$\begin{aligned}
I(V;Y|S) &= H(Y|S) - H(Y|V,S) \\
&\overset{(a)}{\le} 1 - H(Y|V,S) \\
&\overset{(b)}{\le} 1 - H(Y|V,S,X) \\
&= 1 - H(0.1).
\end{aligned}$$

Step (a) holds with equality iff $p(y|s) = 0.5$ for all $y, s \in \{0,1\}$. From the structure of the channel,
this implies that $p(x|s) = 0.5$ for all $x, s \in \{0,1\}$. Step (b) holds with equality iff $H(Y|X,V,S) = H(Y|V,S)$,
or equivalently $I(X;Y|V,S) = 0$. This implies that given $(V,S)$, X and Y are independent, i.e.,
$p(x,y|v,s) = p(x|v,s)p(y|v,s)$. But since $p(x,y|v,s) = p(x|v,s)p(y|x)$, either (i) $p(x|v,s) = 0$ or (ii)
$p(y|v,s) = p(y|x)$ must hold. Now, consider the pair $x = 1, y = 1$. Then, we must have either (i)
$p(x=1|v,s) = 0$ or (ii) $p(y=1|v,s) = p(y=1|x=1) = 0.9$. In (i), X is a function of V and S. In
(ii), we have

$$\begin{aligned}
p(y=1|v,s) &= p(x=1|v,s)\,p(y=1|x=1) + (1 - p(x=1|v,s))\,p(y=1|x=0) \\
&= 0.9\,p(x=1|v,s) + 0.1 - 0.1\,p(x=1|v,s) \\
&= 0.8\,p(x=1|v,s) + 0.1.
\end{aligned}$$

Using the fact that $p(y=1|v,s) = 0.9$, we have $0.8\,p(x=1|v,s) + 0.1 = 0.9 \Rightarrow p(x=1|v,s) = 1$.
This implies again that X is a function of V, S. In both cases (i) and (ii), we see that X is necessarily
a function of V and S, which implies that $Z = X$ is also a function of V and S. Using the fact that
$p(x|s) = p(z|s) = 0.5$ for all x, s, we have

$$I(V;Z|S) = H(Z|S) - H(Z|V,S) = H(X|S) = 1.$$

The first expression in $R_{\text{S-CSI-1}}$ is then upper bounded by

$$\begin{aligned}
I(V;Y|S) - I(V;Z|S) + H(S|Z) &\le I(V;Y|S) - I(V;Z|S) + H(S) \\
&= 1 - H(0.1) - 1 + 1 - H(0.1) \\
&= 1 - 2H(0.1) < 1 - H(0.1).
\end{aligned}$$

This shows that $R_{\text{S-CSI-1}} < R_{\text{S-CSI-2}}$, which completes the example.
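
As a quick numerical sanity check of the example, the following sketch evaluates the binary entropy function at 0.1 and compares the two quantities appearing in the argument. The helper name `binary_entropy` is our own; the sketch only reproduces the arithmetic, not the optimization over input distributions.

```python
import math


def binary_entropy(p: float) -> float:
    """Binary entropy H(p) in bits, with H(0) = H(1) = 0."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)


h = binary_entropy(0.1)   # roughly 0.469 bits
r2 = 1 - h                # value of R_{S-CSI-2} in the example
r1_bound = 1 - 2 * h      # bound on the first expression in R_{S-CSI-1}

if __name__ == "__main__":
    print(f"H(0.1)      = {h:.3f}")
    print(f"1 - H(0.1)  = {r2:.3f}")
    print(f"1 - 2H(0.1) = {r1_bound:.3f}")
```

Under these numbers the gap between the two rates in the example is $H(0.1)$, i.e., a little under half a bit per transmission.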

To illustrate the main ideas of the achievability proof of Theorem 1, we provide an outline for part
of the proof of the rate expression $R_{\text{S-CSI-1}}$. Using the functional representation lemma [11], we can
show that it suffices to perform the maximization in $R_{\text{S-CSI-1}}$ over $p(u)$, $p(x|v,s)$, and functions $v(u,s)$.
Thus, we prove achievability for the equivalent characterization of $R_{\text{S-CSI-1}}$

$$\max_{p(u),\, v(u,s),\, p(x|v,s)} \min\{I(U;Y,S) - I(U;Z,S) + H(S|Z),\ I(U;Y,S)\}. \quad (3)$$



We will outline the proof for the case where $I(U;Y,S) - I(U;Z,S) > 0$. Our coding scheme involves
the transmission of $b - 1$ independent messages over b n-transmission blocks. We split the message $M_j$,
$j \in [2 : b]$, into two independent messages $M_{j0} \in [1 : 2^{nR_0}]$ and $M_{j1} \in [1 : 2^{nR_1}]$, where $R_0 + R_1 = R$.
The codebook generation consists of two steps. The first step is the generation of the message codebook.
We randomly generate $2^{nI(U;Y,S)}$ sequences $u^n(l)$ and partition them into $2^{nR_0}$ equal-size bins. The
codewords in each bin are further partitioned into $2^{nR_1}$ equal-size sub-bins $C(m_0, m_1)$. The second step
is to generate the key codebook. We randomly bin the set of state sequences $s^n$ into $2^{nR_K}$ bins $B(k)$.
The key used in block j is the bin index of the state sequence $S(j-1)$ in block $j - 1$.

To send message $M_j$, $M_{j1}$ is encrypted with the key $K_{j-1}$ to obtain the index $M'_{j1} = M_{j1} \oplus K_{j-1}$.
A codeword $u^n(L)$ is selected uniformly at random from sub-bin $C(M_{j0}, M_{j1} \oplus K_{j-1})$ and transmitted
using Shannon's strategy as depicted in Figure 3. The decoder uses joint typicality decoding together
with its knowledge of the key to decode message $M_j$ at the end of block j. Finally, at the end of block
j, the encoder and decoder declare the bin index $K_j$ of the state sequence $s(j)$ as the key to be used in
block $j + 1$.

Fig. 3: Encoding in block j.

To show that the messages can be kept asymptotically secret from the eavesdropper, note that
$M_{j0}$ is transmitted using Wyner wiretap coding. Hence, it can be kept secret from the eavesdropper provided
$I(U;Y,S) - I(U;Z,S) > 0$. The key part of the proof is to show that the second part of the message
$M_{j1}$, which is encrypted with the key $K_{j-1}$, can be kept secret from the eavesdropper. This involves
showing that the eavesdropper has negligible information about $K_{j-1}$. However, the fact that $K_{j-1}$ is
generated from the state sequence in block $j - 1$ and used in block j results in correlation between it
and all received sequences at the eavesdropper from subsequent blocks. We show that the eavesdropper
has negligible information about $K_{j-1}$ given all its received sequences provided $R_K < H(S|Z)$.
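
The key mechanism in the outline above can be sketched in a few lines. This is only an illustration under stated assumptions: the hypothetical `bin_index` uses a SHA-256 hash as a deterministic stand-in for the random binning of state sequences, and modular addition over the index set plays the role of the $\oplus$ operation. It makes no claim about the secrecy guarantees, which rest on the rate condition $R_K < H(S|Z)$.

```python
import hashlib


def bin_index(state_seq: bytes, num_bins: int) -> int:
    """Map a state sequence to a bin index (hypothetical stand-in for random binning)."""
    digest = hashlib.sha256(state_seq).digest()
    return int.from_bytes(digest, "big") % num_bins


def encrypt(m1: int, key: int, num_bins: int) -> int:
    """One-time-pad style encryption of the message part m1 over Z_{num_bins}."""
    return (m1 + key) % num_bins


def decrypt(m1_enc: int, key: int, num_bins: int) -> int:
    """Inverse of encrypt; the decoder knows the key since it also observes the state."""
    return (m1_enc - key) % num_bins


if __name__ == "__main__":
    num_bins = 16                                # plays the role of 2^{n R_K}
    prev_state = bytes([0, 1, 1, 0, 1, 0, 0, 1])  # state sequence of block j - 1
    key = bin_index(prev_state, num_bins)        # key K_{j-1}, shared by both ends
    m1 = 11                                      # message part M_{j1} for block j
    m1_enc = encrypt(m1, key, num_bins)
    print(decrypt(m1_enc, key, num_bins) == m1)
```

Both the encoder and the decoder can run `bin_index` on the same state sequence, so no key has to be communicated over the channel.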



Upper Bound 

We establish the following upper bound on the secrecy capacity of the wiretap channel with noncausal 
state information available at both the encoder and decoder (which holds also for the causal case). 

Theorem 2: The following is an upper bound on the secrecy capacity of the DM-WTC with state
noncausally available at the encoder and decoder

$$C_{\text{S-NCSI}} \le \min\{I(V_1;Y|U,S) - I(V_1;Z|U,S) + H(S|Z,U),\ I(V_2;Y|S)\}$$

for some U, $V_1$ and $V_2$ such that $p(u, v_1, v_2, x|s) = p(u|s)p(v_1|u,s)p(v_2|v_1,s)p(x|v_2,s)$.

The proof of this theorem is given in Section V. 



Secrecy Capacity Results 

1. Following the lines of [3], we can show that Theorems 1 and 2 are tight for the following two special
cases.

(i) If there exists a $V^*$ such that $\max_{p(v|s)p(x|v,s)} (I(V;Y|S) - I(V;Z|S) + H(S|Z)) = I(V^*;Y|S) - I(V^*;Z|S) + H(S|Z)$
and $I(V^*;Y|S) - I(V^*;Z|S) + H(S|Z) \le I(V^*;Y|S)$, then the secrecy
capacity is $C_{\text{S-CSI}} = C_{\text{S-NCSI}} = I(V^*;Y|S) - I(V^*;Z|S) + H(S|Z)$.

(ii) If there exists a $V'$ such that $\max_{p(v|s)p(x|v,s)} I(V;Y|S) = I(V';Y|S)$ and $I(V';Y|S) \le I(V';Y|S) - I(V';Z|S) + H(S|Z)$,
then the secrecy capacity is $C_{\text{S-CSI}} = C_{\text{S-NCSI}} = I(V';Y|S)$.

2. We show that Theorems 1 and 2 are also tight when $I(U;Y|S) \ge I(U;Z|S)$ for U such that $(U,S) \to (X,S) \to (Y,Z)$
form a Markov chain, i.e., when Y is less noisy than Z for every state $s \in \mathcal{S}$ [12].

Theorem 3: The secrecy capacity for the DM-WTC with the state information available causally or 
noncausally at the encoder and decoder when Y is less noisy than Z is 

$$C_{\text{S-CSI}} = C_{\text{S-NCSI}} = \max_{p(x|s)} \min\{I(X;Y|S) - I(X;Z|S) + H(S|Z),\ I(X;Y|S)\}.$$

Consider the special case when $p(y,z|x,s) = p(y,z|x)$ and Z is a degraded version of Y. Then Theorem 3
specializes to the secrecy capacity for the wiretap channel with a key [13]

$$C_{\text{S-CSI}} = C_{\text{S-NCSI}} = \max_{p(x)} \min\{I(X;Y) - I(X;Z) + H(S),\ I(X;Y)\}.$$

Achievability for Theorem 3 follows directly from Theorem 1 by setting $V = X$ and observing that
the expression reduces to $R_{\text{S-CSI-1}}$ since Y is less noisy than Z. To establish the converse, we use the
less noisy assumption to strengthen the first inequality in Theorem 2 as follows

$$\begin{aligned}
I(V_1;Y|U,S) - I(V_1;Z|U,S) + H(S|Z,U) &\le I(V_1;Y|U,S) - I(V_1;Z|U,S) + H(S|Z) \\
&\overset{(a)}{\le} I(V_1;Y|S) - I(V_1;Z|S) + H(S|Z) \\
&\overset{(b)}{\le} I(X;Y|S) - I(X;Z|S) + H(S|Z),
\end{aligned}$$

where (a), (b) follow from the less noisy assumption. The proof of the second inequality follows by the
data processing inequality: $I(V_2;Y|S) \le I(X;Y|S)$.

3. Next, consider the case where $p(y,z|x,s) = p(y,z|x)$ and the eavesdropper Z is less noisy [12] than
Y. That is, $I(U;Z) \ge I(U;Y)$ for every U such that $U \to X \to (Y,Z)$. Then, the capacity of this
special class of channels is

$$C_{\text{S-CSI}} = C_{\text{S-NCSI}} = \max_{p(x)} \min\{H(S),\ I(X;Y)\}.$$

Achievability follows by setting $V = X$ independent of S. The converse follows from Theorem 2 and
the observation that since Z is less noisy than Y and $p(y,z|x,s) = p(y,z|x)$,

$$\begin{aligned}
I(V_1;Y|U,S) - I(V_1;Z|U,S) + H(S|Z,U) &\le H(S|Z,U) \\
&\le H(S),
\end{aligned}$$

and $I(V_2;Y|S) \le I(X;Y)$.

IV. Proof of Theorem 1 

We will prove the achievability of $R_{\text{S-CSI-1}}$ and $R_{\text{S-CSI-2}}$ separately. For $R_{\text{S-CSI-1}}$, we will prove
the equivalent expression stated in (3).

The proof of achievability for $R_{\text{S-CSI-1}}$ is split into two cases (Cases 1 and 2) while $R_{\text{S-CSI-2}}$ is
proved in Case 3.

Case 1: $R_{\text{S-CSI-1}}$ with $I(U;Y,S) > I(U;Z,S)$

Codebook generation: Split message $M_j$ into two independent messages $M_{j0} \in [1 : 2^{nR_0}]$ and $M_{j1} \in [1 : 2^{nR_1}]$,
thus $R = R_0 + R_1$. Let $\tilde{R} \ge R$. The codebook generation consists of two steps.

Message codeword generation: We randomly and independently generate $2^{n\tilde{R}}$ sequences $u^n(l)$, $l \in [1 : 2^{n\tilde{R}}]$,
each according to $\prod_{i=1}^n p(u_i)$ and partition them into $2^{nR_0}$ equal-size bins $C(m_0)$, $m_0 \in [1 : 2^{nR_0}]$.
We further partition the sequences within each bin $C(m_0)$ into $2^{nR_K}$ equal-size sub-bins $C(m_0, m_1)$,
$m_1 \in [1 : 2^{nR_K}]$.

Key codebook generation: We randomly and uniformly partition the set of $s^n$ sequences into $2^{nR_K}$
bins $B(k)$, $k \in [1 : 2^{nR_K}]$.

Both codebooks are revealed to all parties. 

Encoding: We send $b - 1$ messages over b n-transmission blocks. In the first block, we randomly
select a sequence $u^n(L) \in C(m_{10}, m'_{11})$. The encoder then computes $v_i = v(u_i(L), s_i)$ and transmits a
randomly generated symbol $X_i \sim p(x_i|s_i, v_i)$ for $i \in [1 : n]$. At the end of the first block, the encoder
and decoder declare $k_1 \in [1 : 2^{nR_K}]$ such that $s(1) \in B(k_1)$ as the key to be used in block 2.

Encoding in block $j \in [2 : b]$ proceeds as follows. To send message $m_j = (m_{j0}, m_{j1})$ and given key
$k_{j-1}$, the encoder computes $m'_{j1} = m_{j1} \oplus k_{j-1}$. To ensure secrecy, we must have $R_1 \le R_K$ [14]. The
encoder then randomly selects a sequence $u^n(L) \in C(m_{j0}, m'_{j1})$. It then computes $v_i = v(u_i(L), s_i)$ and
transmits a randomly generated symbol $X_i \sim p(x_i|s_i, v_i)$ for $i \in [(j-1)n + 1 : jn]$.
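
The symbol-by-symbol transmission step (Shannon strategy) described above can be sketched as follows. All names here (`shannon_strategy_encode`, `v_map`, `x_given_sv`) are hypothetical and chosen for illustration; the only point is that the channel input at time i depends on the codeword symbol and the current state alone, so causal state knowledge suffices.

```python
import random


def shannon_strategy_encode(u_seq, state_seq, v_map, x_given_sv, rng):
    """Generate channel inputs one symbol at a time from (u_i, s_i).

    v_map:      dict mapping (u, s) -> v, the function v(u, s)
    x_given_sv: dict mapping (s, v) -> {x: probability}, the pmf p(x | s, v)
    rng:        a random.Random instance used for the stochastic encoder
    """
    xs = []
    for u_i, s_i in zip(u_seq, state_seq):
        v_i = v_map[(u_i, s_i)]            # uses only the current state symbol
        pmf = x_given_sv[(s_i, v_i)]
        symbols = list(pmf)
        weights = [pmf[x] for x in symbols]
        xs.append(rng.choices(symbols, weights=weights, k=1)[0])
    return xs


if __name__ == "__main__":
    # Toy binary instance (an assumption for illustration): v = u XOR s, x = v.
    v_map = {(u, s): u ^ s for u in (0, 1) for s in (0, 1)}
    x_given_sv = {(s, v): {v: 1.0} for s in (0, 1) for v in (0, 1)}
    u_seq = [0, 1, 1, 0]
    state_seq = [1, 1, 0, 0]
    print(shannon_strategy_encode(u_seq, state_seq, v_map, x_given_sv, random.Random(0)))
```

With a non-degenerate `x_given_sv`, the same function realizes the randomized mapping $X_i \sim p(x_i|s_i, v_i)$.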

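Decoding and analysis of the probability of error: At the end of block j, the decoder declares that
$\hat{l}$ is sent if it is the unique index such that $(u^n(\hat{l}), Y(j), S(j)) \in T_{\epsilon''}^{(n)}$; otherwise it declares an error. It
then finds the indices $(\hat{m}_{j0}, \hat{m}'_{j1})$ such that $u^n(\hat{l}) \in C(\hat{m}_{j0}, \hat{m}'_{j1})$. Finally, it recovers $\hat{m}_{j1}$ by computing
$\hat{m}_{j1} = \hat{m}'_{j1} \oplus k_{j-1}$.

To analyze the probability of error, let $\epsilon'' > \epsilon' > \epsilon > 0$ and define the following events for every
$j \in [2 : b]$:

$$\begin{aligned}
\mathcal{E}(j) &= \{\hat{M}_j \neq M_j\}, \\
\mathcal{E}_1(j) &= \{(U^n(L), S(j)) \notin T_{\epsilon'}^{(n)}\}, \\
\mathcal{E}_2(j) &= \{(U^n(L), S(j), Y(j)) \notin T_{\epsilon''}^{(n)}\}, \\
\mathcal{E}_3(j) &= \{(U^n(l), S(j), Y(j)) \in T_{\epsilon''}^{(n)} \text{ for some } l \neq L\}.
\end{aligned}$$

The probability of error is upper bounded as

$$P(\mathcal{E}) = P\Big\{\bigcup_{j=2}^{b} \mathcal{E}(j)\Big\} \le \sum_{j=2}^{b} P(\mathcal{E}(j)).$$

Each probability of error term can be upper bounded as

$$P(\mathcal{E}(j)) \le P(\mathcal{E}_1(j)) + P(\mathcal{E}_2(j) \cap \mathcal{E}_1^c(j)) + P(\mathcal{E}_3(j) \cap \mathcal{E}_2^c(j)).$$

Now, $P(\mathcal{E}_1(j)) \to 0$ as $n \to \infty$ by the law of large numbers (LLN) since $P\{U^n(L) \in T_{\epsilon}^{(n)}\} \to 1$ as
$n \to \infty$ and $S(j) \sim \prod_{i=1}^n p_S(s_i) = \prod_{i=1}^n p(s_i|u_i)$ by independence. The term $P(\mathcal{E}_2(j) \cap \mathcal{E}_1^c(j)) \to 0$ as
$n \to \infty$ by the LLN since $(U^n(L), S(j)) \in T_{\epsilon'}^{(n)}$ and $Y^n \sim \prod_{i=1}^n p(y_i|u_i, s_i)$. For the last term, consider

$$\begin{aligned}
P(\mathcal{E}_3(j) \cap \mathcal{E}_2^c(j)) &\overset{(a)}{\le} \sum_{l} p(l) \sum_{\hat{l} \neq l} P\{(U^n(\hat{l}), S(j), Y(j)) \in T_{\epsilon''}^{(n)} \mid \mathcal{E}_2^c(j), L = l\} \\
&\le \sum_{l} p(l) \sum_{\hat{l} \neq l} 2^{-n(I(U;Y,S) - \delta(\epsilon''))} \le 2^{n(\tilde{R} - I(U;Y,S) + \delta(\epsilon''))},
\end{aligned}$$

where (a) follows from: (i) L is independent of the transmission codebook sequences $U^n$ and the current
state sequence $S(j)$; and (ii) the conditional joint typicality lemma [10, Lecture 2]. Hence, $P(\mathcal{E}_3(j) \cap \mathcal{E}_2^c(j)) \to 0$
as $n \to \infty$ if $\tilde{R} < I(U;Y,S) - \delta(\epsilon'')$.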

Analysis of the information leakage rate: We use $Z^j$ to denote the eavesdropper's received sequence
from blocks 1 to j and $Z(j)$ to denote the received sequence in block j. We will need the following two
results.

Proposition 1: If $R_K < H(S|Z) - 4\delta(\epsilon)$ and $\tilde{R} > I(U;Z,S)$, then the following holds for every
$j \in [1 : b]$.

1) $H(K_j|\mathcal{C}) \ge n(R_K - \delta(\epsilon))$.

2) $I(K_j; Z(j)|\mathcal{C}) \le 2n\delta(\epsilon)$.

3) $I(K_j; Z^j|\mathcal{C}) \le n\delta'(\epsilon)$, where $\delta(\epsilon) \to 0$ and $\delta'(\epsilon) \to 0$ as $\epsilon \to 0$.

The proof of this proposition is given in Appendix I.

Lemma 1: [15] Let $(U,V,Z) \sim p(u,v,z)$, $R \ge 0$ and $\epsilon > 0$. Let $U^n$ be a random sequence
distributed according to $\prod_{i=1}^n p(u_i)$. Let $V^n(l)$, $l \in [1 : 2^{nR}]$, be a set of random sequences that are
conditionally independent given $U^n$ and each distributed according to $\prod_{i=1}^n p(v_i|u_i)$. Let L be a random
index with an arbitrary distribution over $[1 : 2^{nR}]$ independent of $(U^n, V^n(l))$, $l \in [1 : 2^{nR}]$. Then, if
$P\{(U^n, V^n(L), Z^n) \in T_{\epsilon}^{(n)}\} \to 1$ as $n \to \infty$ and $R > I(V;Z|U)$, there exists a $\delta(\epsilon) > 0$, where
$\delta(\epsilon) \to 0$ as $\epsilon \to 0$, such that $H(L|Z^n, U^n) \le n(R - I(V;Z|U)) + n\delta(\epsilon)$.

We are now ready to upper bound the leakage rate averaged over codes. Consider

$$\begin{aligned}
I(M_2, M_3, \ldots, M_b; Z^b|\mathcal{C}) &= \sum_{j=2}^{b} I(M_j; Z^b|\mathcal{C}, M_{j+1}^b) \\
&\overset{(a)}{\le} \sum_{j=2}^{b} I(M_j; Z^b|\mathcal{C}, S(j), M_{j+1}^b) \\
&\overset{(b)}{=} \sum_{j=2}^{b} I(M_j; Z^j|\mathcal{C}, S(j)),
\end{aligned}$$

where (a) follows by the independence of $M_j$ and $(S(j), M_{j+1}^b)$, and (b) follows by the Markov chain
relation $(Z_{j+1}^b, M_{j+1}^b, \mathcal{C}) \to (Z^j, S(j), \mathcal{C}) \to (M_j, \mathcal{C})$. Hence, it suffices to upper bound each individual
term $I(M_j; Z^j|\mathcal{C}, S(j))$. Consider

$$\begin{aligned}
I(M_j; Z^j|\mathcal{C}, S(j)) &= I(M_{j0}, M_{j1}; Z^j|\mathcal{C}, S(j)) \\
&= I(M_{j0}, M_{j1}; Z^{j-1}|\mathcal{C}, S(j)) + I(M_{j0}, M_{j1}; Z(j)|\mathcal{C}, S(j), Z^{j-1}).
\end{aligned}$$

Note that the first term is equal to zero by the independence of $M_j$ and past transmissions, the codebook,
and state sequence. For the second term, we have

$$I(M_{j0}, M_{j1}; Z(j)|\mathcal{C}, S(j), Z^{j-1}) = I(M_{j0}; Z(j)|\mathcal{C}, S(j), Z^{j-1}) + I(M_{j1}; Z(j)|\mathcal{C}, M_{j0}, S(j), Z^{j-1}).$$

We now bound each term separately. Consider the first term

$$\begin{aligned}
I(M_{j0}; Z(j)|\mathcal{C}, S(j), Z^{j-1}) &= I(M_{j0}, L; Z(j)|\mathcal{C}, S(j), Z^{j-1}) - I(L; Z(j)|\mathcal{C}, M_{j0}, S(j), Z^{j-1}) \\
&\le I(U^n; Z(j)|\mathcal{C}, S(j), Z^{j-1}) - H(L|\mathcal{C}, M_{j0}, S(j), Z^{j-1}) + H(L|Z(j), M_{j0}, S(j)) \\
&\le \sum_{i=1}^{n} \big(H(Z_i(j)|\mathcal{C}, S_i(j)) - H(Z_i(j)|\mathcal{C}, U_i, S_i(j))\big) \\
&\quad - H(L|\mathcal{C}, M_{j0}, S(j), Z^{j-1}) + H(L|Z(j), M_{j0}, S(j)) \\
&\overset{(a)}{\le} nI(U;Z|S) - H(L|\mathcal{C}, M_{j0}, S(j), Z^{j-1}) + H(L|Z(j), M_{j0}, S(j)) \\
&\overset{(b)}{\le} nI(U;Z|S) - H(L|\mathcal{C}, M_{j0}, S(j), Z^{j-1}) + n(\tilde{R} - R_0 - I(U;Z,S) + \delta(\epsilon)) \\
&\overset{(c)}{=} n(\tilde{R} - R_0) - H(L|\mathcal{C}, M_{j0}, S(j), Z^{j-1}) + n\delta(\epsilon) \\
&= n(\tilde{R} - R_0) - H(M_{j1} \oplus K_{j-1}|\mathcal{C}, M_{j0}, S(j), Z^{j-1}) \\
&\quad - H(L|\mathcal{C}, M_{j0}, S(j), Z^{j-1}, M_{j1} \oplus K_{j-1}) + n\delta(\epsilon) \\
&\le n(\tilde{R} - R_0) - H(M_{j1} \oplus K_{j-1}|\mathcal{C}, M_{j0}, S(j), K_{j-1}, Z^{j-1}) \\
&\quad - n(\tilde{R} - R_0 - R_K) + n\delta(\epsilon) \\
&\overset{(d)}{=} nR_K - H(M_{j1} \oplus K_{j-1}|\mathcal{C}, M_{j0}, S(j), K_{j-1}) + n\delta(\epsilon) \\
&= nR_K - H(M_{j1}|\mathcal{C}, M_{j0}, S(j), K_{j-1}) + n\delta(\epsilon) = n\delta(\epsilon),
\end{aligned}$$

where (a) follows from the fact that $H(Z_i(j)|\mathcal{C}, S_i(j)) \le H(Z_i(j)|S_i(j)) = H(Z|S)$ and $H(Z_i(j)|\mathcal{C}, U_i, S_i(j)) = H(Z|U,S)$.
Step (b) follows by Lemma 1, which requires that (i) $P\{(U^n(L), S(j), Z(j)) \in T_{\epsilon}^{(n)}\} \to 1$
as $n \to \infty$, and (ii) $\tilde{R} - R_0 > I(U;Z,S)$; here (i) can be shown using the same steps as in the analysis
of probability of error. Step (c) follows by the independence of U and S. Step (d) follows from the
Markov chain relation $(Z^{j-1}, M_{j0}, S(j)) \to (K_{j-1}, M_{j0}, S(j)) \to (M_{j1} \oplus K_{j-1}, M_{j0}, S(j))$. The last
step follows by the fact that $M_{j1}$ is independent of $(\mathcal{C}, M_{j0}, S(j), K_{j-1})$ and uniformly distributed over
$[1 : 2^{nR_K}]$.
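Next, consider the second term

$$\begin{aligned}
I(M_{j1}; Z(j)|\mathcal{C}, M_{j0}, S(j), Z^{j-1}) &\le I(M_{j1}, L; Z(j)|\mathcal{C}, M_{j0}, S(j), Z^{j-1}) - I(L; Z(j)|\mathcal{C}, M_{j0}, M_{j1}, S(j), Z^{j-1}) \\
&\le I(U^n; Z(j)|\mathcal{C}, M_{j0}, S(j), Z^{j-1}) - H(L|\mathcal{C}, M_{j0}, M_{j1}, S(j), Z^{j-1}) \\
&\quad + H(L|\mathcal{C}, M_{j0}, M_{j1}, S(j), Z^{j}) \\
&\overset{(a)}{\le} nI(U;Z|S) - H(L|\mathcal{C}, M_{j0}, M_{j1}, S(j), Z^{j-1}) + H(L|\mathcal{C}, M_{j0}, M_{j1}, S(j), Z^{j}) \\
&\le nI(U;Z|S) - H(L|\mathcal{C}, M_{j0}, M_{j1}, S(j), Z^{j-1}) + H(L|M_{j0}, S(j), Z(j)) \\
&\overset{(b)}{\le} nI(U;Z|S) - H(L|\mathcal{C}, M_{j0}, M_{j1}, S(j), Z^{j-1}) + n(\tilde{R} - R_0) - nI(U;Z,S) + n\delta(\epsilon) \\
&= n(\tilde{R} - R_0) - H(L|\mathcal{C}, M_{j0}, M_{j1}, S(j), Z^{j-1}) + n\delta(\epsilon),
\end{aligned}$$

where (a) follows from the same steps used in bounding $I(M_{j0}; Z(j)|\mathcal{C}, S(j), Z^{j-1})$; (b) follows from
Lemma 1. Next consider

$$\begin{aligned}
H(L|\mathcal{C}, M_{j0}, M_{j1}, S(j), Z^{j-1}) &= H(M_{j1} \oplus K_{j-1}|\mathcal{C}, M_{j0}, M_{j1}, S(j), Z^{j-1}) \\
&\quad + H(L|\mathcal{C}, M_{j0}, M_{j1}, M_{j1} \oplus K_{j-1}, S(j), Z^{j-1}) \\
&= H(K_{j-1}|\mathcal{C}, M_{j0}, M_{j1}, S(j), Z^{j-1}) + n(\tilde{R} - R_0 - R_K) \\
&= H(K_{j-1}|\mathcal{C}, Z^{j-1}) + n(\tilde{R} - R_0 - R_K).
\end{aligned}$$

From Proposition 1, $H(K_{j-1}|\mathcal{C}, Z^{j-1}) \ge n(R_K - \delta(\epsilon) - \delta'(\epsilon))$, which implies that

$$I(M_{j1}; Z(j)|\mathcal{C}, M_{j0}, S(j), Z^{j-1}) \le n(\delta'(\epsilon) + 2\delta(\epsilon)).$$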

This completes the analysis of information leakage rate. 

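Rate analysis: From the analysis of probability of error and information leakage rate, we see that the
rate constraints are

$$\begin{aligned}
\tilde{R} &< I(U;Y,S) - \delta(\epsilon), \\
\tilde{R} - R_0 &> I(U;Z,S), \\
R_K &< H(S|Z) - 4\delta(\epsilon), \\
R_0 + R_1 &\le \tilde{R}, \\
R_1 &\le R_K, \\
R &= R_0 + R_1.
\end{aligned}$$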

Using Fourier-Motzkin elimination (see, e.g., Lecture 6 of [10]), we obtain

$$\begin{aligned}
R &< \max_{p(u),\, v(u,s),\, p(x|v,s)} \min\{I(U;Y,S) - I(U;Z,S) + H(S|Z),\ I(U;Y,S)\} \\
&\overset{(a)}{=} \max_{p(u),\, v(u,s),\, p(x|s,v)} \min\{I(V;Y|S) - I(V;Z|S) + H(S|Z),\ I(V;Y|S)\},
\end{aligned}$$

where (a) follows by the independence of U and S and the fact that V is a function of U and S.
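
The elimination step can be checked numerically on small instances. The sketch below is an illustration under simplifying assumptions: the quantities $I(U;Y,S)$, $I(U;Z,S)$ and $H(S|Z)$ are replaced by integers `a`, `b`, `c` in arbitrary units, the $\delta(\epsilon)$ slack is dropped, and strict inequalities are relaxed to non-strict ones. The function names are our own. It compares a brute-force search over the Case 1 constraints with the closed form $\min\{a - b + c,\ a\}$.

```python
def closed_form(a: int, b: int, c: int) -> int:
    """min{A - B + C, A}: the rate expression obtained after elimination."""
    return min(a - b + c, a)


def brute_force(a: int, b: int, c: int) -> int:
    """Largest R = R0 + R1 over integer points satisfying the relaxed constraints:
    R~ <= A, R~ - R0 >= B, R_K <= C, R0 + R1 <= R~, R1 <= R_K."""
    best = 0
    for r_tilde in range(a + 1):
        # R0 ranges over 0..(R~ - B); the range is empty when R~ < B.
        for r0 in range(max(r_tilde - b, -1) + 1):
            for r_k in range(c + 1):
                for r1 in range(min(r_k, r_tilde - r0) + 1):
                    best = max(best, r0 + r1)
    return best


if __name__ == "__main__":
    for a, b, c in [(8, 3, 2), (8, 3, 6), (5, 5, 4)]:
        print((a, b, c), brute_force(a, b, c), closed_form(a, b, c))
```

The two values agree whenever $a \ge b \ge 0$, which corresponds to the Case 1 condition $I(U;Y,S) > I(U;Z,S)$ up to the relaxation.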

Case 2: $R_{\text{S-CSI-1}}$ with $I(U;Y,S) \le I(U;Z,S)$

Under this condition, the encoder cannot rely on the wiretap channel to send a confidential message.
Therefore, only the key is used to encrypt the message and transmit it securely. Note that we only need
to consider the case where $H(S|Z) - (I(U;Z,S) - I(U;Y,S)) > 0$.

Codebook generation: Codebook generation again consists of two steps. 

Message codebook generation: Let $\tilde{R} > R_d$ and $R \le \tilde{R} - R_d$. Randomly and independently generate
$2^{n\tilde{R}}$ sequences $u^n(l)$, $l \in [1 : 2^{n\tilde{R}}]$, each according to $\prod_{i=1}^n p(u_i)$ and partition them into $2^{nR_d}$ equal-size
bins $C(m_d)$, $m_d \in [1 : 2^{nR_d}]$. We further partition the set of sequences in each bin $C(m_d)$ into sub-bins
$C(m_d, m)$, $m \in [1 : 2^{nR}]$.

Key codebook generation: We randomly bin the set of $s^n \in \mathcal{S}^n$ sequences into $2^{nR_K}$ bins $B(k)$,
$k \in [1 : 2^{nR_K}]$.


Encoding: We send $b - 1$ messages over b n-transmission blocks. In the first block, we randomly select
a $u^n(L)$ sequence. The encoder then computes $v_i = v(u_i(L), s_i)$, $i \in [1 : n]$, and transmits a randomly
generated sequence $X^n$ according to $\prod_{i=1}^n p(x_i|s_i, v_i)$. At the end of the first block, the encoder and
decoder declare $k_1 \in [1 : 2^{nR_K}]$ such that $s(1) \in B(k_1)$ as the key to be used in block 2.

Encoding in block $j \in [2 : b]$ is as follows. We split the key $k_{j-1}$ into two independent parts, $K_{(j-1)d}$ and
$K_{(j-1)m}$, at rates $R_d$ and R, respectively. To send message $m_j$, the encoder computes $m' = m_j \oplus k_{(j-1)m}$.
This requires that $R_K \ge R + R_d$. The encoder then randomly selects a sequence $u^n(L) \in C(k_{(j-1)d}, m')$.
At time $i \in [(j-1)n + 1 : jn]$, it computes $v_i = v(u_i(L), s_i)$, and transmits a randomly generated
symbol $X_i$ according to $p(x_i|s_i, v_i)$.
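
The index bookkeeping for Case 2 can be sketched as follows. This is a hypothetical illustration: `split_key`, `select_sub_bin` and `recover_message` are names introduced here, the key is assumed to be an integer in $[0, N_d N_m)$ with $N_d$ and $N_m$ playing the roles of $2^{nR_d}$ and $2^{nR}$, and modular addition stands in for $\oplus$.

```python
def split_key(key: int, num_m: int) -> tuple:
    """Split a key index into (k_d, k_m), with k_m in [0, num_m)."""
    k_d, k_m = divmod(key, num_m)
    return k_d, k_m


def select_sub_bin(message: int, key: int, num_m: int) -> tuple:
    """Return the sub-bin label (k_d, m') used by the encoder in block j."""
    k_d, k_m = split_key(key, num_m)
    m_enc = (message + k_m) % num_m   # m' = m_j (+) k_m
    return k_d, m_enc


def recover_message(m_enc: int, key: int, num_m: int) -> int:
    """Decoder side: undo the encryption using its own copy of the key."""
    _, k_m = split_key(key, num_m)
    return (m_enc - k_m) % num_m


if __name__ == "__main__":
    num_d, num_m = 4, 8          # illustrative sizes, not taken from the paper
    key = 27                     # key index in [0, num_d * num_m)
    k_d, m_enc = select_sub_bin(5, key, num_m)
    print(k_d, m_enc, recover_message(m_enc, key, num_m))
```

The part `k_d` only selects the bin and carries no message information, which mirrors the role of $K_{(j-1)d}$ in the scheme.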

Decoding and analysis of the probability of error: At the end of block j, the decoder declares that $\hat{l}$
is sent if it is the unique index such that $(u^n(\hat{l}), Y(j), S(j)) \in T_{\epsilon}^{(n)}$ and $u^n(\hat{l}) \in C(k_{(j-1)d})$. Otherwise,
it declares an error. It then finds the index $\hat{m}'$ such that $u^n(\hat{l}) \in C(k_{(j-1)d}, \hat{m}')$. Finally, it recovers $\hat{m}_j$
by computing $\hat{m}_j = \hat{m}' \oplus k_{(j-1)m}$. Following similar steps to the analysis for Case 1, it can be shown
that $P_e^{(n)} \to 0$ as $n \to \infty$ if $\tilde{R} - R_d < I(U;Y,S) - \delta(\epsilon)$.

Analysis of the information leakage rate: Following the same steps as for Case 1, we can show that
it suffices to upper bound the terms $I(M_j; Z(j)|\mathcal{C}, S(j), Z^{j-1})$ for $j \in [2 : b]$. Consider

$$\begin{aligned}
I(M_j; Z(j)|\mathcal{C}, S(j), Z^{j-1}) &= H(M_j) - H(M_j|\mathcal{C}, S(j), Z^{j}) \\
&\le H(M_j) - H(M_j|\mathcal{C}, S(j), K_{(j-1)d}, M_j \oplus K_{(j-1)m}, Z^{j}) \\
&= H(M_j) - H(M_j|\mathcal{C}, K_{(j-1)d}, M_j \oplus K_{(j-1)m}, Z^{j-1}) \\
&= H(M_j) - H(M_j|\mathcal{C}, Z^{j-1}, K_{(j-1)d}) - H(M_j \oplus K_{(j-1)m}|\mathcal{C}, Z^{j-1}, K_{(j-1)d}, M_j) \\
&\quad + H(M_j \oplus K_{(j-1)m}|\mathcal{C}, Z^{j-1}, K_{(j-1)d}) \\
&= nR - H(M_j) + H(M_j \oplus K_{(j-1)m}|\mathcal{C}, Z^{j-1}, K_{(j-1)d}) \\
&\quad - H(M_j \oplus K_{(j-1)m}|\mathcal{C}, Z^{j-1}, K_{(j-1)d}, M_j) \\
&\le nR - H(K_{(j-1)m}|\mathcal{C}, Z^{j-1}, K_{(j-1)d}).
\end{aligned}$$

Thus, showing that

$$I(K_{(j-1)m}; Z^{j-1}|\mathcal{C}, K_{(j-1)d}) \le n\delta'(\epsilon), \quad (4)$$
$$H(K_{(j-1)m}|\mathcal{C}, K_{(j-1)d}) \ge n(R_K - R_d - \delta(\epsilon)) \quad (5)$$

implies

$$I(M_j; Z(j)|\mathcal{C}, S(j), Z^{j-1}) \le nR - n(R_K - R_d) + n(\delta'(\epsilon) + \delta(\epsilon)).$$

Hence, the rate of information leakage approaches zero as $n \to \infty$ if $R \le R_K - R_d$. To prove (4) and (5),
we need the following proposition.

Proposition 2: If $\tilde{R} > I(U;Z,S)$ and $R_K < H(S|Z) - 4\delta(\epsilon)$, then for all $j \in [1 : b]$,

1) $H(K_j|\mathcal{C}) \ge n(R_K - \delta(\epsilon))$.

2) $I(K_j; Z(j)|\mathcal{C}) \le 3n\delta(\epsilon)$.

3) $I(K_j; Z^j|\mathcal{C}) \le n\delta'(\epsilon)$, where $\delta(\epsilon) \to 0$ and $\delta'(\epsilon) \to 0$ as $\epsilon \to 0$.

The proof of this proposition is given in Appendix II.

Part 3 of Proposition 2 implies (4), since

$$\begin{aligned}
I(K_{j-1}; Z^{j-1}|\mathcal{C}) &= I(K_{(j-1)d}, K_{(j-1)m}; Z^{j-1}|\mathcal{C}) \\
&= I(K_{(j-1)d}; Z^{j-1}|\mathcal{C}) + I(K_{(j-1)m}; Z^{j-1}|\mathcal{C}, K_{(j-1)d}).
\end{aligned}$$

Part 1 of Proposition 2 implies (5), since $H(K_{j-1}|\mathcal{C}) = H(K_{(j-1)m}, K_{(j-1)d}|\mathcal{C}) \ge n(R_K - \delta(\epsilon))$,
which implies that $H(K_{(j-1)m}|\mathcal{C}, K_{(j-1)d}) \ge n(R_K - R_d - \delta(\epsilon))$.

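Rate analysis: The following rate constraints are necessary for Case 2.

$$\begin{aligned}
\tilde{R} &> I(U;Z,S), \\
\tilde{R} - R_d &< I(U;Y,S) - \delta(\epsilon), \\
R &\le \tilde{R} - R_d, \\
R_K &< H(S|Z) - 4\delta(\epsilon), \\
R &\le R_K - R_d.
\end{aligned}$$

Using Fourier-Motzkin elimination, we obtain

$$\begin{aligned}
R &< \max_{p(u),\, v(u,s)} \min\{I(U;Y,S) - I(U;Z,S) + H(S|Z),\ I(U;Y,S)\} \\
&= \max_{p(u),\, v(u,s),\, p(x|s,v)} \min\{I(V;Y|S) - I(V;Z|S) + H(S|Z),\ I(V;Y|S)\}.
\end{aligned}$$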
Case 3: $R_{\text{S-CSI-2}}$

For $R_{\text{S-CSI-2}}$, the key generated in a block is used purely to encrypt the message in the following block.
This implies that there is a possibility that the eavesdropper can decode the codeword transmitted in the 
current block, which reduces the key rate that can be generated at the current block. This is compensated 
for by the fact that the entire key is used for message transmission. The codebook generation, encoding 
and analysis of probability of error and equivocation are therefore similar to that in Case 2. 

Codebook generation: Codebook generation again consists of two steps. 

Message codebook generation: Randomly and independently generate $2^{nR}$ sequences $v^n(l)$, $l \in [1 : 2^{nR}]$,
each according to $\prod_{i=1}^n p(v_i)$.

Key codebook generation: Set $R_K = R$. We randomly bin the set of $s^n \in \mathcal{S}^n$ sequences into $2^{nR_K}$
bins $B(k)$, $k \in [1 : 2^{nR_K}]$.

Encoding: We send $b - 1$ messages over b n-transmission blocks. In the first block, we randomly
select a $v^n(L)$ sequence. The encoder then transmits a randomly generated sequence $X^n$ according to
$\prod_{i=1}^n p(x_i|s_i, v_i)$. At the end of the first block, the encoder and decoder declare $k_1 \in [1 : 2^{nR_K}]$ such
that $s(1) \in B(k_1)$ as the key to be used in block 2.

Encoding in block $j \in [2 : b]$ is as follows. To send message $m_j$, the encoder computes $m' = m_j \oplus k_{j-1}$.
The encoder then selects the sequence $v^n(m')$. At time $i \in [(j-1)n + 1 : jn]$, it transmits a randomly
generated symbol $X_i$ according to $p(x_i|s_i, v_i)$.

Decoding and analysis of the probability of error: At the end of block j, the decoder declares that $\hat{m}'$
is sent if it is the unique index such that $(v^n(\hat{m}'), Y(j), S(j)) \in T_{\epsilon}^{(n)}$. Otherwise, it declares an error. It
then recovers $\hat{m}_j$ by computing $\hat{m}_j = \hat{m}' \oplus k_{j-1}$. Following similar steps to the analysis for Case 1, it
can be shown that $P_e^{(n)} \to 0$ as $n \to \infty$ if $R < I(V;Y,S) - \delta(\epsilon)$.

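Analysis of the information leakage rate: Following the same steps as for Case 1, we can show that
it suffices to upper bound the terms $I(M_j; Z(j)|\mathcal{C}, S(j), Z^{j-1})$ for $j \in [2 : b]$. Consider

$$\begin{aligned}
I(M_j; Z(j)|\mathcal{C}, S(j), Z^{j-1}) &= H(M_j) - H(M_j|\mathcal{C}, S(j), Z^{j}) \\
&\le H(M_j) - H(M_j|\mathcal{C}, S(j), M_j \oplus K_{j-1}, Z^{j}) \\
&= H(M_j) - H(M_j|\mathcal{C}, M_j \oplus K_{j-1}, Z^{j-1}) \\
&= H(M_j) - H(M_j \oplus K_{j-1}, M_j|\mathcal{C}, Z^{j-1}) + H(M_j \oplus K_{j-1}|\mathcal{C}, Z^{j-1}) \\
&\le nR - H(M_j|\mathcal{C}, Z^{j-1}) - H(M_j \oplus K_{j-1}|\mathcal{C}, Z^{j-1}, M_j) + nR \\
&= nR - H(K_{j-1}|\mathcal{C}, Z^{j-1}).
\end{aligned}$$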

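Thus, showing that 

\begin{align*}
I(K_{j-1}; \mathbf{Z}^{j-1} | \mathcal{C}) &\le n\delta'(\epsilon), \tag{6} \\
H(K_{j-1} | \mathcal{C}) &\ge n(R_K - \delta(\epsilon)) \tag{7}
\end{align*}

implies 

\[
I(M_j; \mathbf{Z}(j) | \mathcal{C}, \mathbf{S}(j), \mathbf{Z}^{j-1}) \le n(\delta'(\epsilon) + \delta(\epsilon)).
\]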
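
The last line of the chain above says that the leakage is the gap between the message length $nR$ and the entropy of the key as seen by the eavesdropper. A minimal single-shot sketch in Python, assuming a message uniform on $\{0, \ldots, 2^{\text{bits}} - 1\}$ and an independent key with an arbitrary pmf (the function names are hypothetical), computes $I(M; M \oplus K)$ exactly and shows that it equals $\text{bits} - H(K)$:

```python
import math
from collections import defaultdict


def entropy(pmf):
    """Entropy in bits of a pmf given as a dict outcome -> probability."""
    return -sum(p * math.log2(p) for p in pmf.values() if p > 0)


def leakage_bits(key_pmf, bits):
    """I(M; M xor K) for M uniform on {0, ..., 2**bits - 1}, independent of K."""
    size = 2 ** bits
    joint = defaultdict(float)               # pmf of the pair (M, M xor K)
    for m in range(size):
        for k, pk in key_pmf.items():
            joint[(m, m ^ k)] += pk / size
    marg_m, marg_c = defaultdict(float), defaultdict(float)
    for (m, c), p in joint.items():
        marg_m[m] += p
        marg_c[c] += p
    # I(M; C) = H(M) + H(C) - H(M, C)
    return entropy(marg_m) + entropy(marg_c) - entropy(joint)


uniform_key = {0: 0.25, 1: 0.25, 2: 0.25, 3: 0.25}     # H(K) = 2 bits
skewed_key = {0: 0.5, 1: 0.25, 2: 0.125, 3: 0.125}     # H(K) = 1.75 bits
print(leakage_bits(uniform_key, 2), leakage_bits(skewed_key, 2))
```

With the uniform key the leakage is zero (a one-time pad); with the skewed key it is $2 - 1.75 = 0.25$ bits. This is why conditions (6) and (7) ask for a key that is nearly uniform and nearly independent of the eavesdropper's past observations.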

To prove (6) and (7), we will use the following Proposition 

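Proposition 3: If $R_K < H(S|Z,V) - 4\delta(\epsilon)$, then for all $j \in [1 : b]$, 

1) $H(K_j | \mathcal{C}) \ge n(R_K - \delta(\epsilon))$. 

2) $I(K_j; \mathbf{Z}(j) | \mathcal{C}) \le 3n\delta(\epsilon)$. 

3) $I(K_j; \mathbf{Z}^{j} | \mathcal{C}) \le n\delta'(\epsilon)$, where $\delta(\epsilon)$ and $\delta'(\epsilon) \to 0$ as $\epsilon \to 0$. 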

The proof of this Proposition is given in Appendix III. It is clear that equations (6) and (7) are implied 
by Proposition 3, which completes the analysis of information leakage rate. 

Rate analysis: The following rate constraints are necessary for Case 3. 

\begin{align*}
R &= R_K, \\
R &< I(V; Y, S) - \delta(\epsilon), \\
R_K &< H(S|Z,V) - 4\delta(\epsilon).
\end{align*}

These constraints imply the achievability of 

\[
R < \max_{p(v)p(x|s,v)} \min\{H(S|Z,V),\, I(V;Y|S)\}.
\]
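
To make the two terms in this expression concrete, the following Python sketch evaluates $\min\{H(S|Z,V), I(V;Y|S)\}$ for one fixed choice of $p(v)p(x|s,v)$ on a toy channel. It does not perform the maximization, and the helper names and the two toy channels are assumptions introduced only for illustration.

```python
import math
from collections import defaultdict
from itertools import product


def H(pmf):
    """Entropy in bits of a pmf given as a dict outcome -> probability."""
    return -sum(p * math.log2(p) for p in pmf.values() if p > 0)


def marginal(joint, idx):
    """Marginal pmf of the coordinates listed in idx."""
    out = defaultdict(float)
    for outcome, p in joint.items():
        out[tuple(outcome[i] for i in idx)] += p
    return out


def case3_rate(p_s, p_v, p_x_given_sv, p_yz_given_xs):
    """Return (min term, H(S|Z,V), I(V;Y|S)) for a fixed p(v) p(x|s,v)."""
    joint = defaultdict(float)               # pmf of (s, v, y, z)
    for (s, ps), (v, pv) in product(p_s.items(), p_v.items()):
        for x, px in p_x_given_sv(s, v).items():
            for (y, z), pyz in p_yz_given_xs(x, s).items():
                joint[(s, v, y, z)] += ps * pv * px * pyz
    S, V, Y, Z = 0, 1, 2, 3
    h_s_zv = H(marginal(joint, (S, Z, V))) - H(marginal(joint, (Z, V)))
    # I(V;Y|S) = H(V,S) + H(Y,S) - H(V,Y,S) - H(S)
    i_vy_s = (H(marginal(joint, (V, S))) + H(marginal(joint, (Y, S)))
              - H(marginal(joint, (V, Y, S))) - H(marginal(joint, (S,))))
    return min(h_s_zv, i_vy_s), h_s_zv, i_vy_s


half = {0: 0.5, 1: 0.5}
send_v = lambda s, v: {v: 1.0}                       # X = V (deterministic)
# Toy channel A: Y = Z = X, state does not affect the channel.
rate_a = case3_rate(half, half, send_v, lambda x, s: {(x, x): 1.0})
# Toy channel B: Y = X, Z = X xor S, so (Z, V) reveals S to the eavesdropper.
rate_b = case3_rate(half, half, send_v, lambda x, s: {(x, x ^ s): 1.0})
print(rate_a, rate_b)
```

In toy channel A the eavesdropper sees exactly what the receiver sees, yet the expression evaluates to 1 bit because the shared state supplies a full-rate key. In toy channel B the eavesdropper's output together with $V$ determines the state, so $H(S|Z,V) = 0$ and this choice of input distribution gives rate 0.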
V. Proof of Theorem 2 

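For any sequence of codes with probability of error and leakage rate that approach zero as $n \to \infty$, consider 

\begin{align*}
nR = H(M) &\overset{(a)}{\le} I(M; Y^n, S^n) + n\epsilon_n \\
&\overset{(b)}{\le} I(M; Y^n, S^n) - I(M; Z^n) + 2n\epsilon_n \\
&= \sum_{i=1}^n \left( I(M; Y_i, S_i | Y_{i+1}^n, S_{i+1}^n) - I(M; Z_i | Z^{i-1}) \right) + 2n\epsilon_n \\
&\overset{(c)}{=} \sum_{i=1}^n \left( I(M, Z^{i-1}; Y_i, S_i | Y_{i+1}^n, S_{i+1}^n) - I(M, Y_{i+1}^n, S_{i+1}^n; Z_i | Z^{i-1}) \right) + 2n\epsilon_n \\
&\overset{(d)}{=} \sum_{i=1}^n \left( I(M; Y_i, S_i | Y_{i+1}^n, S_{i+1}^n, Z^{i-1}) - I(M; Z_i | Y_{i+1}^n, S_{i+1}^n, Z^{i-1}) \right) + 2n\epsilon_n \\
&\overset{(e)}{=} \sum_{i=1}^n \left( I(V_{1i}; Y_i, S_i | U_i) - I(V_{1i}; Z_i | U_i) \right) + 2n\epsilon_n \\
&= \sum_{i=1}^n \left( I(V_{1i}; Y_i, S_i | U_i) - I(V_{1i}; Z_i, S_i | U_i) + I(V_{1i}; S_i | Z_i, U_i) \right) + 2n\epsilon_n \\
&\le \sum_{i=1}^n \left( I(V_{1i}; Y_i, S_i | U_i) - I(V_{1i}; Z_i, S_i | U_i) + H(S_i | Z_i, U_i) \right) + 2n\epsilon_n \\
&= \sum_{i=1}^n \left( I(V_{1i}; Y_i | U_i, S_i) - I(V_{1i}; Z_i | U_i, S_i) + H(S_i | Z_i, U_i) \right) + 2n\epsilon_n \\
&\overset{(f)}{=} n\left( I(V_1; Y | U, S) - I(V_1; Z | U, S) + H(S | Z, U) \right) + 2n\epsilon_n,
\end{align*}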

where (a) follows by Fano's inequality; (b) follows from the secrecy condition; (c) and (d) follow from the Csiszár sum identity; (e) follows from defining $U_i = (Y_{i+1}^n, S_{i+1}^n, Z^{i-1})$ and $V_{1i} = (M, Y_{i+1}^n, S_{i+1}^n, Z^{i-1})$; and (f) follows from setting $Q$ to be a uniform random variable over $[1 : n]$, independent of all other variables, and defining $U = (U_Q, Q)$, $V_1 = (V_{1Q}, Q)$, $S = S_Q$, $Y = Y_Q$ and $Z = Z_Q$. 
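
Steps (c) and (d) rest on the Csiszár sum identity, $\sum_{i=1}^n I(Z^{i-1}; Y_i | Y_{i+1}^n) = \sum_{i=1}^n I(Y_{i+1}^n; Z_i | Z^{i-1})$, which holds for any joint distribution. As a sanity check, the Python sketch below evaluates both sides for a randomly drawn joint pmf of binary $(Y_1, Y_2, Y_3, Z_1, Z_2, Z_3)$. The helper names, the choice $n = 3$, and the random pmf are assumptions made only for this illustration.

```python
import math
import random
from collections import defaultdict
from itertools import product


def entropy_of(joint, idx):
    """Entropy in bits of the marginal on the coordinates listed in idx."""
    marg = defaultdict(float)
    for outcome, p in joint.items():
        marg[tuple(outcome[i] for i in idx)] += p
    return -sum(p * math.log2(p) for p in marg.values() if p > 0)


def cond_mi(joint, a, b, c=()):
    """I(A;B|C) = H(A,C) + H(B,C) - H(A,B,C) - H(C)."""
    a, b, c = tuple(a), tuple(b), tuple(c)
    return (entropy_of(joint, a + c) + entropy_of(joint, b + c)
            - entropy_of(joint, a + b + c) - entropy_of(joint, c))


def csiszar_sums(joint, n):
    """Both sides of the identity; coordinates 0..n-1 are Y, n..2n-1 are Z."""
    Y = list(range(n))
    Z = list(range(n, 2 * n))
    # With 0-based i: Z[:i] is Z^{i-1}, Y[i] is Y_i, Y[i+1:] is Y_{i+1}^n.
    lhs = sum(cond_mi(joint, Z[:i], [Y[i]], Y[i + 1:]) for i in range(n))
    rhs = sum(cond_mi(joint, Y[i + 1:], [Z[i]], Z[:i]) for i in range(n))
    return lhs, rhs


rng = random.Random(1)
n = 3
outcomes = list(product((0, 1), repeat=2 * n))
weights = [rng.random() for _ in outcomes]
total = sum(weights)
joint = {o: w / total for o, w in zip(outcomes, weights)}

lhs, rhs = csiszar_sums(joint, n)
print(abs(lhs - rhs) < 1e-9)
```

The two sums agree up to floating-point error even though the individual terms differ, which is what allows the $Z^{i-1}$ and $(Y_{i+1}^n, S_{i+1}^n)$ terms to be moved into the conditioning in step (d).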
For the second upper bound, we have 

\begin{align*}
nR &\le I(M; Y^n, S^n) + n\epsilon_n \\
&\overset{(a)}{=} I(M; Y^n | S^n) + n\epsilon_n \\
&= \sum_{i=1}^n I(M; Y_i | S^n, Y_{i+1}^n) + n\epsilon_n \\
&\;\;\vdots \\
&= nI(V_{2Q}; Y | S, Q) + n\epsilon_n \\
&\overset{(c)}{\le} nI(V_2; Y | S) + n\epsilon_n,
\end{align*}

where (a) follows from the independence between M and S n ; (b) follows from defining 
V 2i = (M, Yfi_ v Z i_1 , and (c) follows from defining V 2 = (V 2Q ,Q). 

VI. Conclusion 

We established bounds on the secrecy capacity of the wiretap channel with state information causally 
available at the encoder and decoder. We showed that our lower bound can be strictly larger than the 
best known lower bound for the noncausal state information case. The upper bound holds when the state 
information is available noncausally at the encoder and decoder. We showed that the bounds are tight for 
several classes of wiretap channels. 

We used key generation from state information to improve the message transmission rate. It may be 
possible to extend this idea to the case when state information is available only at the encoder. This case, 
however, is not straightforward to analyze since it would be necessary for the encoder to reveal some 
state information to the decoder (and hence partially to the eavesdropper) in order to agree on a secret 
key. This may reduce the wiretap coding part of the rate. 

References 

[1] A. D. Wyner, "The wire-tap channel," Bell System Technical Journal, vol. 54, no. 8, pp. 1355-1387, 1975. 

[2] I. Csiszár and J. Körner, "Broadcast channels with confidential messages," IEEE Trans. Inf. Theory, vol. IT-24, pp. 339-348, May 1978. 

[3] Y. Chen and A. J. Han Vinck, "Wiretap channel with side information," IEEE Trans. Inf. Theory, vol. 54, no. 1, pp. 395-402, 2008. 

[4] W. Liu and B. Chen, "Wiretap channel with two-sided state information," in Proc. 41st Asilomar Conf. Signals, Systems and Comp., Pacific Grove, CA, Nov. 2007, pp. 893-897. 

[5] A. Khisti, S. N. Diggavi, and G. W. Wornell, "Secret key agreement using asymmetry in channel state knowledge," in Proc. IEEE International Symposium on Information Theory, Seoul, South Korea, July 2009, pp. 2286-2290. 

[6] U. M. Maurer, "Secret key agreement by public discussion from common information," IEEE Trans. Inf. Theory, vol. 39, no. 3, pp. 733-742, 1993. 

[7] R. Ahlswede and I. Csiszár, "Common randomness in information theory and cryptography. I. Secret sharing," IEEE Trans. Inf. Theory, vol. 39, no. 4, 1993. 

[8] C. E. Shannon, "Channels with side information at the transmitter," IBM J. Res. Develop., vol. 2, pp. 289-293, 1958. 

[9] E. Ardestanizadeh, M. Franceschetti, T. Javidi, and Y. H. Kim, "Wiretap channel with rate-limited feedback," in Proc. IEEE International Symposium on Information Theory, Toronto, Canada, July 2008, pp. 101-105. 

[10] A. El Gamal and Y. H. Kim, "Lectures on network information theory," 2010, available online at ArXiv: http://arxiv.org/abs/1001.3404. 

[11] F. M. J. Willems and E. C. van der Meulen, "The discrete memoryless multiple-access channel with cribbing encoders," IEEE Trans. Inf. Theory, vol. 31, no. 3, pp. 313-327, 1985. 

[12] J. Körner and K. Marton, "Comparison of two noisy channels," in Topics in Information Theory (Second Colloq., Keszthely, 1975), 1977, pp. 411-423. 

[13] H. Yamamoto, "Rate-distortion theory for the Shannon cipher system," IEEE Trans. Inf. Theory, vol. 43, no. 3, pp. 827-835, 1997. 

[14] C. E. Shannon, "Communication theory of secrecy systems," Bell System Tech. J., vol. 28, pp. 656-715, 1949. 

[15] Y. K. Chia and A. El Gamal, "3-receiver broadcast channels with common and confidential messages," 2009, submitted to IEEE Trans. Inf. Theory. Available online at ArXiv: http://arxiv.org/abs/0910.1407. 

Appendix I 
Proof of Proposition 1 

1. The proof of this result follows largely from Lemma 2 in Lecture 23 of Lectures on Network 
Information Theory by El Gamal and Kim [10]. For completeness, we give the proof here. Consider 

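\begin{align*}
H(K_j | \mathcal{C}) &\ge P\{\mathbf{S}(j) \in \mathcal{T}_\epsilon^{(n)}\}\, H(K_j | \mathcal{C}, \mathbf{S}(j) \in \mathcal{T}_\epsilon^{(n)}) \\
&\ge (1 - \epsilon'_n)\, H(K_j | \mathcal{C}, \mathbf{S}(j) \in \mathcal{T}_\epsilon^{(n)}).
\end{align*}

Let $P(k_j)$ be the random pmf of $K_j$ given $\{\mathbf{S}(j) \in \mathcal{T}_\epsilon^{(n)}\}$, where the randomness is induced by the random bin assignment (codebook) $\mathcal{C}$. 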

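By symmetry, $P(k_j)$, $k_j \in [1 : 2^{nR_K}]$, are identically distributed. We express $P(1)$ in terms of a weighted sum of indicator functions as 

\[
P(1) = \sum_{s^n \in \mathcal{T}_\epsilon^{(n)}} \frac{p(s^n)}{P\{\mathbf{S}(j) \in \mathcal{T}_\epsilon^{(n)}\}} \cdot I_{\{s^n \in \mathcal{B}(1)\}}.
\]

It can be easily shown that 

\begin{align*}
E_{\mathcal{C}}(P(1)) &= 2^{-nR_K}, \\
\mathrm{Var}(P(1)) &= 2^{-nR_K}(1 - 2^{-nR_K}) \sum_{s^n \in \mathcal{T}_\epsilon^{(n)}} \left( \frac{p(s^n)}{P\{\mathbf{S}(j) \in \mathcal{T}_\epsilon^{(n)}\}} \right)^2 \\
&\le 2^{-nR_K}\, 2^{n(H(S) + \delta(\epsilon))}\, \frac{2^{-2n(H(S) - \delta(\epsilon))}}{(1 - \epsilon'_n)^2} \\
&\le 2^{-n(R_K + H(S) - 4\delta(\epsilon))}
\end{align*}

for sufficiently large $n$. 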

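By the Chebyshev inequality, 

\begin{align*}
P\{|P(1) - E(P(1))| \ge \epsilon E(P(1))\} &\le \frac{\mathrm{Var}(P(1))}{(\epsilon E(P(1)))^2} \\
&\le \frac{2^{-n(H(S) - R_K - 4\delta(\epsilon))}}{\epsilon^2}.
\end{align*}

Note that if $R_K < H(S) - 4\delta(\epsilon)$, this probability $\to 0$ as $n \to \infty$. Now, by symmetry 

\begin{align*}
H(K_j | \mathcal{C}, \mathbf{S}(j) \in \mathcal{T}_\epsilon^{(n)}) &= 2^{nR_K} E(P(1) \log(1/P(1))) \\
&\ge 2^{nR_K} P\{|P(1) - E(P(1))| < \epsilon 2^{-nR_K}\}\, E\big(P(1) \log(1/P(1)) \,\big|\, |P(1) - E(P(1))| < \epsilon 2^{-nR_K}\big) \\
&\ge \left( 1 - \frac{2^{-n(H(S) - R_K - 4\delta(\epsilon))}}{\epsilon^2} \right) \cdot \left( nR_K(1 - \epsilon) - (1 - \epsilon)\log(1 + \epsilon) \right) \\
&\ge n(R_K - \delta(\epsilon))
\end{align*}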

for sufficiently large $n$ and $R_K < H(S) - 4\delta(\epsilon)$. 

Thus, we have shown that if $R_K < H(S) - 4\delta(\epsilon)$, then $H(K_j | \mathcal{C}) \ge n(R_K - \delta(\epsilon))$ for $n$ sufficiently large. This completes the proof of part 1 of Proposition 1. Note now that since $H(S|Z) \le H(S)$, the same result also holds if $R_K < H(S|Z) - 4\delta(\epsilon)$. 
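
The mean and variance of $P(1)$ used in part 1 can be checked exactly on a small instance by enumerating every bin assignment. The Python sketch below does this under stated assumptions: the list `weights` plays the role of the normalized probabilities $p(s^n)/P\{\mathbf{S}(j) \in \mathcal{T}_\epsilon^{(n)}\}$, `num_bins` plays the role of $2^{nR_K}$, and both function names are hypothetical.

```python
from itertools import product


def bin_prob_moments(weights, num_bins):
    """Exact mean and variance of P(1) over all equally likely bin assignments."""
    total, total_sq, count = 0.0, 0.0, 0
    for assignment in product(range(num_bins), repeat=len(weights)):
        # P(1) is the total weight of the sequences that fall into the first bin.
        p1 = sum(w for w, b in zip(weights, assignment) if b == 0)
        total += p1
        total_sq += p1 * p1
        count += 1
    mean = total / count
    return mean, total_sq / count - mean ** 2


def closed_form_moments(weights, num_bins):
    """E(P(1)) = q and Var(P(1)) = q (1 - q) * sum of squared weights, q = 1/num_bins."""
    q = 1.0 / num_bins
    return q, q * (1.0 - q) * sum(w * w for w in weights)


weights = [0.4, 0.3, 0.2, 0.1]     # normalized probabilities of four sequences
print(bin_prob_moments(weights, 2))
print(closed_form_moments(weights, 2))
```

Both computations give mean 0.5 and variance 0.075 for this instance. The variance scales with the sum of squared weights, which for typical sequences is roughly $2^{-nH(S)}$, and this is the reason the Chebyshev bound vanishes when $R_K$ is below $H(S)$.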

2. We need to show that if $R_K < H(S|Z) - 3\delta(\epsilon)$, then $I(K_j; \mathbf{Z}(j) | \mathcal{C}) \le 2n\delta(\epsilon)$ for every $j \in [1 : b]$. We have 

\[
I(K_j; \mathbf{Z}(j) | \mathcal{C}) = I(\mathbf{S}(j); \mathbf{Z}(j) | \mathcal{C}) - I(\mathbf{S}(j); \mathbf{Z}(j) | K_j, \mathcal{C}).
\]

We analyze the terms separately. For the first term, we have 

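\begin{align*}
I(\mathbf{S}(j); \mathbf{Z}(j) | \mathcal{C}) &= I(\mathbf{S}(j), L; \mathbf{Z}(j) | \mathcal{C}) - I(L; \mathbf{Z}(j) | \mathbf{S}(j), \mathcal{C}) \\
&\le I(U^n, \mathbf{S}(j); \mathbf{Z}(j) | \mathcal{C}) - H(L | \mathbf{S}(j), \mathcal{C}) + H(L | \mathbf{S}(j), Z^n) \\
&\le nI(U, S; Z) - H(L | \mathbf{S}(j), \mathcal{C}) + H(L | \mathbf{S}(j), Z^n) \\
&\overset{(a)}{\le} nI(U, S; Z) - H(L | \mathbf{S}(j), \mathcal{C}) + n(\tilde{R} - I(U; Z, S) + \delta(\epsilon)) \\
&= n\tilde{R} - H(M_{j0} | \mathcal{C}) - H(M_{j1} \oplus K_{j-1} | \mathcal{C}, M_{j0}) \\
&\quad - H(L | M_{j0}, M_{j1} \oplus K_{j-1}, \mathcal{C}) + nI(S; Z) + n\delta(\epsilon) \\
&\le n\tilde{R} - nR_0 - H(M_{j1} \oplus K_{j-1} | \mathcal{C}, M_{j0}, K_{j-1}) - n(\tilde{R} - R_0 - R_K) + nI(S; Z) + n\delta(\epsilon) \\
&= nR_K - H(M_{j1} | \mathcal{C}, M_{j0}, K_{j-1}) + nI(S; Z) + n\delta(\epsilon) \\
&= n(I(S; Z) + \delta(\epsilon)),
\end{align*}

where step (a) follows from application of Lemma 1, which holds since $\tilde{R} - R_0 > I(U; Z, S)$. For the second term we have 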

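\begin{align*}
I(\mathbf{S}(j); \mathbf{Z}(j) | K_j, \mathcal{C}) &= H(\mathbf{S}(j) | K_j, \mathcal{C}) - H(\mathbf{S}(j) | \mathbf{Z}(j), K_j, \mathcal{C}) \\
&= H(\mathbf{S}(j), K_j | \mathcal{C}) - H(K_j | \mathcal{C}) - H(\mathbf{S}(j) | \mathbf{Z}(j), K_j, \mathcal{C}) \\
&\ge nH(S) - nR_K - H(\mathbf{S}(j) | \mathbf{Z}(j), K_j, \mathcal{C}) \\
&\ge n(H(S) - R_K) - H(\mathbf{S}(j) | \mathbf{Z}(j), K_j) \\
&\overset{(b)}{\ge} n(H(S) - R_K) - n(H(S|Z) - R_K + \delta(\epsilon)) \\
&= nI(S; Z) - n\delta(\epsilon),
\end{align*}

where (b) follows from showing that $H(\mathbf{S}(j) | \mathbf{Z}(j), K_j) \le n(H(S|Z) - R_K + \delta(\epsilon))$. This requires the condition $R_K < H(S|Z) - 3\delta(\epsilon)$. Combining the bounds for the two expressions gives $I(K_j; \mathbf{Z}(j) | \mathcal{C}) \le 2n\delta(\epsilon)$. 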

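Proof of step (b): Give an arbitrary ordering to the set of all state sequences $s^n$ with $\mathbf{S}(j) = s^n(T)$ for some $T \in [1 : 2^{n \log|\mathcal{S}|}]$. Hence, $H(\mathbf{S}(j) | \mathbf{Z}(j), K) = H(T | K, \mathbf{Z}(j))$. 

From the coding scheme, we know that $P\{(s^n(T), \mathbf{Z}(j)) \in \mathcal{T}_\epsilon^{(n)}\} \to 1$ as $n \to \infty$. Note here that $T$ is random and corresponds to the realization of $S^n$. 

Now, fix $T = t$, $\mathbf{Z}(j) = z^n$, $K = k$ and define $N(z^n, k, t) := |\{l \in [1 : |\mathcal{T}_\epsilon^{(n)}(S)|] : (s^n(l), z^n) \in \mathcal{T}_\epsilon^{(n)},\ l \ne t,\ s^n(l) \in \mathcal{B}(k)\}|$. For $z^n \notin \mathcal{T}_\epsilon^{(n)}$, $N(z^n, k, t) = 0$. For $z^n \in \mathcal{T}_\epsilon^{(n)}$, it is easy to show that 

\begin{align*}
\frac{|\mathcal{T}_\epsilon^{(n)}(S|z^n)| - 1}{2^{nR_K}} \le E(N(z^n, k, t)) &\le \frac{|\mathcal{T}_\epsilon^{(n)}(S|z^n)|}{2^{nR_K}}, \\
\mathrm{Var}(N(z^n, k, t)) &\le \frac{|\mathcal{T}_\epsilon^{(n)}(S|z^n)|}{2^{nR_K}}.
\end{align*}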



By the Chebyshev inequality, 

\begin{align*}
P\{N(z^n, k, t) \ge (1 + \epsilon) E(N(z^n, k, t))\} &\le \frac{\mathrm{Var}(N(z^n, k, t))}{(\epsilon E(N(z^n, k, t)))^2} \\
&\le 2^{-n(H(S|Z) - 3\delta(\epsilon) - R_K)}.
\end{align*}

Note that $P\{N(z^n, k, t) \ge (1 + \epsilon) E(N(z^n, k, t))\} \to 0$ as $n \to \infty$ if $R_K < H(S|Z) - 3\delta(\epsilon)$. Now, define the following events 

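\begin{align*}
\mathcal{E}_1 &:= \{(\mathbf{S}(j), \mathbf{Z}(j)) \notin \mathcal{T}_\epsilon^{(n)}\}, \\
\mathcal{E}_2 &:= \{N(\mathbf{Z}(j), K, T) \ge (1 + \epsilon) E(N(\mathbf{Z}(j), K, T))\}.
\end{align*}

Let $E = 0$ if $\mathcal{E}_1^c \cap \mathcal{E}_2^c$ occurs and 1 otherwise. We have 

\begin{align*}
P(E = 1) &\le P(\mathcal{E}_1) + P(\mathcal{E}_2) \\
&\le P(\mathcal{E}_1) + \sum_{(z^n, s^n(t)) \in \mathcal{T}_\epsilon^{(n)},\, k} p(z^n, t, k)\, P\{N(z^n, k, t) \ge (1 + \epsilon) E(N(z^n, k, t))\} \\
&\quad + P\{(s^n(T), \mathbf{Z}(j)) \notin \mathcal{T}_\epsilon^{(n)}\}.
\end{align*}

$P\{(s^n(T), \mathbf{Z}(j)) \notin \mathcal{T}_\epsilon^{(n)}\} = P(\mathcal{E}_1)$ and $P(\mathcal{E}_1) \to 0$ as $n \to \infty$ by the coding scheme. For the second term, $P\{N(z^n, k, t) \ge (1 + \epsilon) E(N(z^n, k, t))\} \to 0$ as $n \to \infty$ if $R_K < H(S|Z) - 3\delta(\epsilon)$. Hence, $P(E = 1) \to 0$ as $n \to \infty$ if $R_K < H(S|Z) - 3\delta(\epsilon)$. 

We can now bound $H(T | K, Z^n)$ by 

\begin{align*}
H(T | K, Z^n) &\le 1 + P(E = 1) H(T | K, Z^n, E = 1) + H(T | K, Z^n, E = 0) \\
&\le n(H(S|Z) - R_K + \delta(\epsilon)).
\end{align*}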



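3. To upper bound $I(K_j; \mathbf{Z}^{j} | \mathcal{C})$, we use an induction argument assuming that $I(K_{j-1}; \mathbf{Z}^{j-1} | \mathcal{C}) \le n\delta_{j-1}(\epsilon)$, where $\delta_{j-1}(\epsilon) \to 0$ as $\epsilon \to 0$. Note that the proof for $j = 2$ follows from part 2. Consider 

\begin{align*}
I(K_j; \mathbf{Z}^{j} | \mathcal{C}) &= I(K_j; \mathbf{Z}(j) | \mathcal{C}) + I(K_j; \mathbf{Z}^{j-1} | \mathcal{C}, \mathbf{Z}(j)) \\
&\overset{(a)}{\le} 2n\delta(\epsilon) + I(K_j; \mathbf{Z}^{j-1} | \mathcal{C}, \mathbf{Z}(j)) \\
&= H(\mathbf{Z}^{j-1} | \mathcal{C}, \mathbf{Z}(j)) - H(\mathbf{Z}^{j-1} | \mathcal{C}, \mathbf{Z}(j), K_j) + 2n\delta(\epsilon) \\
&\le H(\mathbf{Z}^{j-1} | \mathcal{C}) - H(\mathbf{Z}^{j-1} | \mathcal{C}, K_{j-1}, \mathbf{Z}(j), K_j) + 2n\delta(\epsilon) \\
&\overset{(b)}{=} H(\mathbf{Z}^{j-1} | \mathcal{C}) - H(\mathbf{Z}^{j-1} | \mathcal{C}, K_{j-1}) + 2n\delta(\epsilon) \\
&= I(K_{j-1}; \mathbf{Z}^{j-1} | \mathcal{C}) + 2n\delta(\epsilon) \\
&\overset{(c)}{\le} n\delta_{j-1}(\epsilon) + 2n\delta(\epsilon),
\end{align*}

where (a) follows from part 2 of the Proposition; (b) follows from the Markov chain relation $\mathbf{Z}^{j-1} \to K_{j-1} \to (\mathbf{Z}(j), K_j)$; and (c) follows from the induction hypothesis. This completes the proof since the last line implies that there exists a $\delta'(\epsilon)$, where $\delta'(\epsilon) \to 0$ as $\epsilon \to 0$, that upper bounds $I(K_j; \mathbf{Z}^{j} | \mathcal{C})$ for $j \in [1 : b]$. 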

Appendix II 
Proof of Proposition 2 

1. We first show that if $R_K < H(S) - 4\delta(\epsilon)$, then $H(K_j | \mathcal{C}) \ge n(R_K - \delta(\epsilon))$. This is done in the same manner as part 1 of Proposition 1. The proof is therefore omitted. 

2. We need to show that if $R_K < H(S|Z) - 3\delta(\epsilon)$, then $I(K_j; \mathbf{Z}(j) | \mathcal{C}) \le 2n\delta(\epsilon)$ for every $j \in [1 : b]$. We have 

\[
I(K_j; \mathbf{Z}(j) | \mathcal{C}) = I(\mathbf{S}(j); \mathbf{Z}(j) | \mathcal{C}) - I(\mathbf{S}(j); \mathbf{Z}(j) | K_j, \mathcal{C}).
\]

We analyze the terms separately. For the first term, we have 

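\begin{align*}
I(\mathbf{S}(j); \mathbf{Z}(j) | \mathcal{C}) &= I(\mathbf{S}(j), L; \mathbf{Z}(j) | \mathcal{C}) - I(L; \mathbf{Z}(j) | \mathbf{S}(j), \mathcal{C}) \\
&\le I(U^n, \mathbf{S}(j); \mathbf{Z}(j) | \mathcal{C}) - H(L | \mathbf{S}(j), \mathcal{C}) + H(L | \mathbf{S}(j), Z^n) \\
&\le nI(U, S; Z) - H(L | \mathbf{S}(j), \mathcal{C}) + H(L | \mathbf{S}(j), Z^n) \\
&\overset{(a)}{\le} nI(U, S; Z) - H(L | \mathcal{C}) + n(\tilde{R} - I(U; Z, S) + \delta(\epsilon)) \\
&= n\tilde{R} - H(K_{(j-1)d} | \mathcal{C}) - H(K_{(j-1)m} \oplus M_j | \mathcal{C}) + nI(S; Z) \\
&\quad - H(L | K_{(j-1)m} \oplus M_j, K_{(j-1)d}) + n\delta(\epsilon) \\
&\overset{(b)}{\le} n(\tilde{R} - R_d - R - \tilde{R} + R_d + R + 2\delta(\epsilon)) + nI(S; Z) \\
&= n(I(S; Z) + 2\delta(\epsilon)),
\end{align*}

where step (a) follows from application of Lemma 1, which holds from the condition that $\tilde{R} > I(U; Z, S)$, and the fact that $\mathbf{S}(j)$ is independent of $L$. Step (b) follows from part 1 of Proposition 2: $H(K_{j-1} | \mathcal{C}) \ge n(R_K - \delta(\epsilon))$, which implies that $H(K_{(j-1)d} | \mathcal{C}) \ge n(R_d - \delta(\epsilon))$. Note that we implicitly assumed $j \ge 2$. The case of $j = 1$ is straightforward, since $H(L | \mathcal{C}) = n\tilde{R}$ by the fact that we transmit a codeword picked uniformly at random. 

The proof that $I(\mathbf{S}(j); \mathbf{Z}(j) | K_j, \mathcal{C}) \ge nI(S; Z) - n\delta(\epsilon)$ follows the same steps as the proof of part 2 of Proposition 1 and requires the same condition that $R_K < H(S|Z) - 3\delta(\epsilon)$. 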

3. Part 3 of the Proposition is proved in the same manner as part 3 of Proposition 1. 

Appendix III 
Proof of Proposition 3 

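1. We first show that if $R_K < H(S) - 4\delta(\epsilon)$, then $H(K_j | \mathcal{C}) \ge n(R_K - \delta(\epsilon))$. This is done in the same manner as part 1 of Proposition 1. The proof is therefore omitted. 

2. We need to show that if $R_K < H(S|Z,V) - 3\delta(\epsilon)$, then $I(K_j; \mathbf{Z}(j) | \mathcal{C}) \le n\delta(\epsilon)$ for every $j \in [1 : b]$. We have 

\begin{align*}
I(K_j; \mathbf{Z}(j) | \mathcal{C}) &\le I(K_j; \mathbf{Z}(j), V^n | \mathcal{C}) \\
&= I(\mathbf{S}(j); \mathbf{Z}(j), V^n | \mathcal{C}) - I(\mathbf{S}(j); \mathbf{Z}(j), V^n | K_j, \mathcal{C}).
\end{align*}

We analyze the terms separately. For the first term, we have 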
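\begin{align*}
I(\mathbf{S}(j); \mathbf{Z}(j), V^n | \mathcal{C}) &= I(\mathbf{S}(j); \mathbf{Z}(j) | V^n, \mathcal{C}) \\
&= \sum_{i=1}^n \left( H(Z_i(j) | \mathcal{C}, V^n, Z^{i-1}(j)) - H(Z_i(j) | \mathcal{C}, V^n, \mathbf{S}(j), Z^{i-1}(j)) \right) \\
&\le \sum_{i=1}^n \left( H(Z_i(j) | \mathcal{C}, V_i) - H(Z_i(j) | \mathcal{C}, V_i, S_i(j)) \right) \\
&\le n(H(Z|V) - H(Z|V,S)) \\
&= nI(Z; S | V) = nI(Z, V; S).
\end{align*}

For the second term, we have 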

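\begin{align*}
I(\mathbf{S}(j); \mathbf{Z}(j), V^n | K_j, \mathcal{C}) &= H(\mathbf{S}(j) | K_j, \mathcal{C}) - H(\mathbf{S}(j) | \mathbf{Z}(j), V^n, K_j, \mathcal{C}) \\
&= H(\mathbf{S}(j), K_j | \mathcal{C}) - H(K_j | \mathcal{C}) - H(\mathbf{S}(j) | \mathbf{Z}(j), V^n, K_j, \mathcal{C}) \\
&\ge nH(S) - nR_K - H(\mathbf{S}(j) | \mathbf{Z}(j), V^n, K_j, \mathcal{C}) \\
&\ge n(H(S) - R_K) - H(\mathbf{S}(j) | \mathbf{Z}(j), V^n, K_j) \\
&\overset{(b)}{\ge} n(H(S) - R_K) - n(H(S|Z,V) - R_K + \delta(\epsilon)) \\
&= nI(S; Z, V) - n\delta(\epsilon).
\end{align*}

The proof of step (b) follows the same steps as in the proof of part 2 of Proposition 1. We can show that step (b) holds if $R_K < H(S|Z,V) - 3\delta(\epsilon)$. 

Combining the two terms then gives the required upper bound, which completes the proof of part 2. 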

3. Part 3 of the Proposition is proved in the same manner as part 3 of Proposition 1. 